Estimating kinship in admixed populations.
Abstract
Genome-wide association studies (GWASs) are commonly used for the mapping of genetic loci that influence complex traits. A problem often encountered in both population-based and family-based GWASs is identifying cryptic relatedness and population stratification, because failure to account appropriately for both pedigree and population structure can lead to spurious association. A number of methods have been proposed for identifying relatives in samples from homogeneous populations. A strong assumption of population homogeneity, however, is often untenable, and many GWASs include samples from structured populations. Here, we consider the problem of estimating relatedness in structured populations with admixed ancestry. We propose a method, REAP (relatedness estimation in admixed populations), for robust estimation of identity by descent (IBD)-sharing probabilities and kinship coefficients in admixed populations. REAP appropriately accounts for population structure and ancestry-related assortative mating by using individual-specific allele frequencies at SNPs that are calculated on the basis of ancestry derived from whole-genome analysis. In simulation studies with related individuals and admixture from highly divergent populations, we demonstrate that REAP gives accurate IBD-sharing probabilities and kinship coefficients. We apply REAP to the Mexican Americans in Los Angeles, California (MXL) population sample of release 3 of phase III of the International Haplotype Map Project; in this sample, we identify third- and fourth-degree relatives who have not previously been reported. We also apply REAP to the African American and Hispanic samples from the Women's Health Initiative SNP Health Association Resource (WHI-SHARe) study, in which hundreds of pairs of cryptically related individuals have been identified.
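The abstract describes individual-specific allele frequencies derived from genome-wide ancestry but gives no formulas. The following is a minimal sketch of that idea, not the REAP implementation. It assumes that each individual's ancestry proportions and the allele frequencies of each ancestral population are already known. The function names, and the estimator form (a genotype covariance centered on individual-specific frequencies and scaled by the expected variance), are illustrative assumptions rather than details taken from this text.

```python
from math import sqrt
from typing import List, Sequence


def individual_allele_freqs(ancestry: Sequence[float],
                            pop_freqs: Sequence[Sequence[float]]) -> List[float]:
    """Individual-specific allele frequency at each SNP.

    ancestry[k]     : proportion of the genome from ancestral population k
    pop_freqs[k][s] : allele frequency at SNP s in ancestral population k
    Returns mu[s] = sum_k ancestry[k] * pop_freqs[k][s].
    """
    n_snps = len(pop_freqs[0])
    return [sum(a * pop[s] for a, pop in zip(ancestry, pop_freqs))
            for s in range(n_snps)]


def kinship(geno_i: Sequence[int], geno_j: Sequence[int],
            mu_i: Sequence[float], mu_j: Sequence[float]) -> float:
    """Assumed moment-style kinship estimate for one pair of individuals.

    geno_*[s] is the allele count (0, 1 or 2) at SNP s; mu_*[s] is the
    individual-specific allele frequency. Each genotype is centered on its
    own expectation 2 * mu, so ancestry differences are not mistaken for
    relatedness.
    """
    num = 0.0
    den = 0.0
    for gi, gj, mi, mj in zip(geno_i, geno_j, mu_i, mu_j):
        num += (gi - 2.0 * mi) * (gj - 2.0 * mj)
        den += sqrt(mi * (1.0 - mi) * mj * (1.0 - mj))
    if den == 0.0:
        raise ValueError("no informative SNPs (all frequencies are 0 or 1)")
    return num / (4.0 * den)


if __name__ == "__main__":
    # Hypothetical toy data: two ancestral populations, three SNPs.
    pop_freqs = [[0.2, 0.4, 0.1],
                 [0.6, 0.8, 0.7]]
    mu_a = individual_allele_freqs([0.5, 0.5], pop_freqs)
    mu_b = individual_allele_freqs([0.9, 0.1], pop_freqs)
    print(kinship([1, 2, 0], [0, 1, 0], mu_a, mu_b))
```

With only a handful of SNPs the estimate is very noisy; an estimator of this kind is meaningful only when averaged over many markers genome-wide.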
Similar resources
Estimating relationships between phenotypes and subjects drawn from admixed families
BACKGROUND Estimating relationships among subjects in a sample, within family structures or caused by population substructure, is complicated in admixed populations. Inaccurate allele frequencies can bias both kinship estimates and tests for association between subjects and a phenotype. We analyzed the simulated and real family data from Genetic Analysis Workshop 19, and were aware of the simul...
Estimation of kinship coefficient in structured and admixed populations using sparse sequencing data
Knowledge of biological relatedness between samples is important for many genetic studies. In large-scale human genetic association studies, the estimated kinship is used to remove cryptic relatedness, control for family structure, and estimate trait heritability. However, estimation of kinship is challenging for sparse sequencing data, such as those from off-target regions in target sequencing...
Determining Ancestry Proportions in Complex Admixture Scenarios in South Africa Using a Novel Proxy Ancestry Selection Method
Admixed populations can make an important contribution to the discovery of disease susceptibility genes if the parental populations exhibit substantial variation in susceptibility. Admixture mapping has been used successfully, but is not designed to cope with populations that have more than two or three ancestral populations. The inference of admixture proportions and local ancestry ...
Analysis of Population Substructure in Two Sympatric Populations of Gran Chaco, Argentina
Sub-population structure and intricate kinship dynamics might introduce biases in molecular anthropology studies and could invalidate the efforts to understand diseases in highly admixed populations. In order to clarify the previously observed distribution pattern and morbidity of Chagas disease in Gran Chaco, Argentina, we studied two populations (Wichí and Criollos) recruited following an inn...
Assessing individual interethnic admixture and population substructure using a 48-insertion-deletion (INSEL) ancestry-informative marker (AIM) panel.
Estimating the proportions of different ancestries in admixed populations is very important in population genetics studies, and it is particularly important for detecting population substructure effects in case-control association studies. In this work, a set of 48 ancestry-informative insertion-deletion polymorphisms (INDELs) were selected with the goal of efficiently measuring the proportions...
Journal title: American Journal of Human Genetics
Volume 91, Issue 1
Pages: -
Publication date: 2012